Abstract. The coat proteins of many viruses spontaneously form icosahedral capsids
around nucleic acids or other polymers. Elucidating the role of the packaged polymer 
in capsid formation could promote biomedical efforts to block viral replication and 
enable use of capsids in nanomaterials applications. To this end, we perform Brownian 
dynamics on a coarse-grained model that describes the dynamics of icosahedral capsid 
assembly around a flexible polymer. We identify several mechanisms by which the 
polymer plays an active role in its encapsulation, including cooperative polymer-protein 
motions. These mechanisms are related to experimentally controllable parameters such 
as polymer length, protein concentration, and solution conditions. Furthermore, the 
simulations demonstrate that assembly mechanisms are correlated to encapsulation 
efficiency, and we present a phase diagram that predicts assembly outcomes as a 
function of experimental parameters. We anticipate that our simulation results will 
provide a framework for designing in vitro assembly experiments on single-stranded 
RNA virus capsids. 
1. Introduction

During the replication of many viruses with single-stranded RNA (ssRNA) genomes, 
hundreds to thousands of protein subunits spontaneously assemble around the viral 
nucleic acid to form an icosahedral protein shell, or capsid. Understanding the 
factors that confer robustness to this cooperative multicomponent assembly process 
would advance technologies that exploit capsids as drug delivery vehicles or imaging 
agents [1, 2, 3, 4, 5, 6, 7], and could establish principles for the design of synthetic
containers with controllable assembly or disassembly. Furthermore, numerous human 
pathogenic viruses have ssRNA genomes, and understanding how nucleic acid properties 
promote capsid assembly could spur the development of antiviral drugs that block viral 
replication. The nucleic acid cargo is essential for assembly, since ssRNA viral proteins 
require RNA (or other polyanions [8, 9, 10, 11, 12, 13, 14, 15, 16, 17])
to assemble at physiological conditions. However, the role of the packaged polymer is 
poorly understood because assembly intermediates are transient and thus challenging to 
characterize with experiments. Therefore, this article considers dynamical simulations 
of a model for icosahedral capsid assembly around a flexible polymer, which result in 
experimentally testable predictions for the morphologies and yields of assembly products 
as functions of polymer length and solution conditions. Furthermore, the simulations 
demonstrate that, depending on solution conditions and the strength of interactions 
between viral proteins, assembly around a polymer can proceed by significantly different
mechanisms. How the interactions among viral components control their assembly 
mechanisms and products is a fundamental question of physical virology. 

Performing atomistic simulations of the complete dynamics of a capsid assembling 
around its genome is not computationally feasible [21]. However, experimental model
systems in which capsid proteins assemble into icosahedral capsids around synthetic 
polyelectrolytes [8, 9, 15, 14, 17], charge-functionalized nanoparticles [10, 11, 12, 13, 14,
16], and nano-emulsions [20] demonstrate that properties specific to nucleic acids are
not required for capsid formation or cargo packaging. Therefore, in this article we strive 
for general conclusions about the assembly of an icosahedral shell around a polymer 
by considering a simplified geometric model, inspired by previous simulations of empty
capsid assembly [22, 23]. The model employs trimeric protein subunits, represented as
rigid triangular bodies, with short-ranged attractions arranged so that an icosahedron
is the lowest energy state. The subunits experience short range attractive interactions 
(representing the effect of screened electrostatics) with a flexible polymer, and assembly 
is simulated with Brownian dynamics. 

By taking advantage of their high degrees of symmetry and structural regularity, 
the structures of virus capsids assembled around single-stranded nucleic acids have been 
revealed by x-ray crystallography and/or cryo-electron microscopy (cryo-EM) images 
(e.g.[2ll ESlESlEZlEHlESlEOlSIlEa E^). The packaged nucleic
acids are less ordered than their protein containers and hence have been more difficult to
characterize. However, cryo-EM experiments have shown that the nucleotide densities
are nonuniform, with a peak near the inner capsid surface and relatively low densities
in the interior [271 EHl EH]. For some viruses striking image reconstructions show that 
the packaged RNA adopts the symmetry of its protein capsid (e.g. [EZl [30l [37] ) . While 
atomistic detail has not been possible in these experiments, all-atom models have been
derived from equilibrium simulations [21, 50]. Furthermore, a number of equilibrium
calculations have analyzed the electrostatics of packaging a polyelectrolyte inside a 

capsid [iiiiiasoisaiiassiiisii^ 

Despite these structural studies and equilibrium calculations, the kinetic pathways 
by which capsid proteins assemble around their genome or other cargoes remain 
incompletely understood. An in vitro experiment on assembly of cowpea chlorotic mottle 
virus (CCMV) [26] demonstrated different kinetics than for assembly of capsid proteins
alone. The results suggested protein-RNA complexes as important intermediates and
showed that the relative concentrations of protein and RNA affect assembly mechanisms. 
However, the structures of intermediates and the specific assembly mechanisms could 
not be resolved. Recently several groups have begun to overcome this limitation by 
characterizing assembly intermediates using mass spectrometry (e.g. [52, 53, 29, 30, 54]).
Stockley and coworkers [29, 30, 54] performed a remarkable series of experiments on
MS2 that, along with a computational study [55], provide strong evidence that RNA
binding allosterically mediates conformational changes that dictate capsid morphologies. 
However, many assembly intermediates and thus the complete assembly pathways 
could not be resolved. Furthermore, while experiments have examined the relationship 
between solution conditions and assembly morphologies for CCMV [56, 57], the effect
of the properties of the nucleic acid cargo, such as its length and interactions with the 
capsid proteins, on capsid assembly morphologies has received only limited exploration 
(e.g. P El [Ml El [28]). 

Theoretical or computational modeling therefore can play an important role in 
understanding the dynamics of capsid assembly around a polymer and the relationship 
between polymer properties and the structures that emerge from assembly. Several 
previous modeling efforts have postulated roles of the RNA in the formation of 
icosahedral geometries [29, 58] and in enhancing assembly rates [59], but the final
structure and assembly pathways were assumed in advance. Recently our group [60] explored
capsid assembly around a flexible polymer with a model defined on a cubic lattice, 
which allowed simulation of large capsid-like cuboidal shells over long time scales. By 
simulating assembly with a wide range of capsid sizes and polymer lengths, we found 
that there is an optimal polymer length which maximizes encapsulation yields at finite 
observation times. The optimal length scales with the number of attractive sites on the 
capsid, unless there are attractions between polymer segments. 

In this article, we perform dynamical simulations on the encapsulation of a flexible 
polymer by a model capsid with icosahedral symmetry, which enables the predicted 
assembly products to be directly compared to experimentally observed morphologies. 
Depending on polymer length and solution conditions, the simulations predict assembly 
morphologies that include the polymer completely encapsulated by the icosahedral 
capsid or non-icosahedral capsules, and several forms of disordered assemblages that
fail to completely enclose the polymer. Furthermore, we are able to determine the 
importance of cooperative subunit-polymer motions, which were poorly supported by 
the single particle Monte Carlo moves used in [60].

We find that the relationships between polymer length, interaction strengths, and 
assembly yields are qualitatively similar to Ref. [60], but that a different assembly 
mechanism emerges when the interactions between capsid subunits are very weak and 
interactions with the polymer are relatively strong. In this mechanism, first hypothesized 
by McPherson [61] and later by Refs. [62, 44], a large number of subunits bind to
the polymer in a disordered fashion, and then collectively reorient to form an ordered 
shell. This mechanism can lead to a high yield of well-formed capsids assembled 
around polymers for carefully tuned parameters, but complete polymer encapsulation 
is sensitive to changes in system parameters. Regions of parameter space that support 
the sequential assembly mechanism known for empty capsid assembly [63] are more
robust to variations in parameters. Finally, we demonstrate that assembly yields are 
controlled by a competition between kinetics and thermodynamics by comparing the 
predictions of our dynamical simulations at finite observation times to the equilibrium 
thermodynamics for the same model. We find that the thermodynamically optimal 
polymer length is larger than the optimum found in the dynamical simulations, but 
that thermodynamics can identify the maximum polymer length at which significant 
yields are achieved in the dynamics. Understanding the relationship between kinetics
and equilibrium predictions could be especially useful because it is possible to perform 
equilibrium calculations on models with more detail than is feasible with dynamical 
simulations (e.g. [^I^ISEaETlEHlEil^lnllTai^l^^ 

Finally, we note that the simulations in this work are meant to represent 
experimental model systems in which capsid proteins assemble around synthetic 
polyelectrolytes O [El [JT] or homopolymeric RNA. This choice was made because: 
(1) Capsids assemble around synthetic polyelectrolytes ^ 1151 [17| and nanoparticles 
[76, 10, 16, 12, 13], which demonstrates that properties specific to nucleic acids are not
required for capsid formation or cargo packaging. (2) The tertiary structures of viral 
RNAs in solution are poorly understood [77]. Given the dearth of knowledge about
viral RNA base pairing, we consider a simple polymer model that emphasizes universal 
aspects of capsid assembly around flexible polymers. However, nucleic acid base pairing
and sequence dependent interactions could have important effects on assembly pathways 
and kinetics of assembly around single-stranded RNA; some of these potential effects 
are highlighted in the context of our simulation results. 
2. Methods

2.1. Subunit model 

Capsid proteins typically have several hundred amino acids and assemble on time 
scales of seconds to hours. Thus, simulating the spontaneous assembly of even the 
smallest icosahedral capsid with 60 proteins is infeasible at atomic resolution [21].
However, it has been shown that the capsid proteins of many viruses adopt folds with 
similar excluded volume shapes, often represented as trapezoids [78]. We thus follow
the approach taken in recent simulations of the assembly of empty icosahedral shells 
[64, 65, 66, 23, 67, 68, 69, 70, 74, 22, 75], in which we imagine integrating over degrees of
freedom that fluctuate on time scales much shorter than subunit collision times to arrive
at a simple model for capsid subunits in which they have an excluded volume geometry
and orientation-dependent attractions designed such that the lowest energy structure is 
an icosahedral shell. 

Specifically, we consider truncated-pyramidal capsomers designed such that the
lowest energy structure is a perfect icosahedron (figure 1b). This design is similar to
models used by Rapaport et al.^HiSZl^ and Nguyen et al.[23j in simulations of empty 
capsid assembly and could correspond to capsomers composed of a trimer of proteins
that form a T=1 capsid. The model subunits are composed of a set of overlapping
spherical 'excluders' that enforce excluded volume and spherical 'attractors' with short-
range, pairwise, complementary attractions that decorate the binding interfaces of
the subunit. Each subunit is composed of two layers of excluders and attractors.
Attractor positions are arranged so that complementary attractors along a subunit-
subunit interface perfectly overlap in the ground state configuration; excluders on either
side of the interface are separated by exactly the cutoff of their potential ($x_c$, Eq. (4)).
Subunits have no internal degrees of freedom: they translate and rotate as rigid bodies.

2.2. Polymer model 

We represent the polymer as a freely jointed chain of spherical monomers, with excluded
volume that includes effects of screened electrostatic repulsions [79]. In the absence of
any capsomer subunits, the model represents a polymer in good solvent, which behaves
as a self-avoiding random walk with radius of gyration $R_g = 0.21\,N_p^{3/5}\,\sigma_b$, with $\sigma_b$ the
monomer diameter. We then add short-ranged attractions to spherical attractors on 
the interior surface of model capsid subunits that qualitatively represent the effects of 
screened electrostatic interactions between negative charges on the polyelectrolyte or 
nucleic acid and positive charges on the interior surface of capsid proteins. While these 
positive charges are found on flexible N-terminal 'ARMs' in many ssRNA viruses, our 
model was particularly motivated by the small RNA bacteriophages (e.g. MS2), in 
which the RNA or other polyanions interact with positive charges on the interior capsid 
surface. These interactions have been characterized over the past two decades through
a series of crystal structures of MS2 capsids with different sequences of short RNA
hairpins (e.g. [33, 34, 35, 25, 36]) and, more recently, cryo-EM images show the genomic
RNA inside the MS2 capsid [30]. Fig. 2a shows an image of a trimer of dimers of the
MS2 coat protein from the crystal structure, highlighting the location of positive charges
and RNA binding sites (a dimer is the fundamental subunit for MS2). Consistent with
the overall simplicity of our model, we crudely capture the geometry of those charges
by placing the capsid-polymer attractors as shown in Fig. 2b. Other arrangements and
numbers of attractor sites lead to similar results; however, simulated assembly was less
effective when distances between attractor sites were incommensurate with the ground
state distance between polymer subunits. The comparison with MS2 is only meant to
be suggestive, as for computational simplicity we consider a flexible homopolymer and
we model a T=1 capsid with trimers as the basic assembly unit, while MS2 has a T=3
capsid and the dimer is the assembly unit [29].

Figure 1: The model capsid geometry. (a) Two dimensional projection of one layer
of a model subunit illustrating the geometry of the capsomer-capsomer pair potential,
equation (3), with a particular excluder and attractor highlighted from each subunit.
The potential is the sum over all excluder-excluder and complementary attractor-
attractor pairs. (b) An example of a well-formed model capsid. (c) Cutaway of
a well-formed capsid.



2.3. Pair Interaction 

In our model, all potentials can be decomposed into pairwise interactions. Potentials
involving capsomer subunits further decompose into pairwise interactions between their
constituent building blocks, the excluders and attractors. The potential of capsomer
subunit $i$, $U_{\mathrm{cap},i}$, with position $\mathbf{R}_i$, attractor positions $\{\mathbf{a}_i\}$ and excluder positions $\{\mathbf{b}_i\}$
is the sum of a capsomer-capsomer part, $U_{cc}$, and a capsomer-polymer part, $U_{cp}$:

$$U_{\mathrm{cap},i} = \sum_{\mathrm{cap}\ j \neq i} U_{cc}(\mathbf{R}_i, \{\mathbf{b}_i\}, \{\mathbf{a}_i\}, \mathbf{R}_j, \{\mathbf{a}_j\}, \{\mathbf{b}_j\}) + \sum_{\mathrm{poly}\ k} U_{cp}(\mathbf{R}_i, \{\mathbf{b}_i\}, \{\mathbf{a}_i\}, \mathbf{R}_k), \qquad (1)$$

where the first sum is over all capsomers other than $i$ and the second sum is over
all polymer segments.

Figure 2: (a) Image of a trimer of dimers of the MS2 coat protein [25], which was
generated from the crystal structure PDB ID: 1ZDH [25] using VMD [80]. The three
proteins of the crystal structure asymmetric unit are shown along with the three
symmetry-related subunits that complete the dimer subunits. The protein atoms are
shown in van der Waals representation, RNA stem-loops are drawn in cartoon format
and colored green, and positive charges on the proteins are colored blue. (b) The
arrangement of polymer attractors on the model capsid subunit, as viewed from inside
the capsid. The capsomer-polymer attractors are colored blue and the capsomer-
capsomer attractors are colored green. (c) A cutaway view of a snapshot of a
polymer with $N_p = 200$ segments encapsulated in a well-formed model capsid. Polymer
subunits and capsomer-attractors are colored according to their interaction energy: red
for non-interacting, green for optimal interaction and a gradient for intermediate states.

Similarly, the potential of a polymer subunit $i$ is the sum of a
capsomer-polymer term, $U_{cp}$, and a polymer-polymer term, $U_{pp}$:

$$U_{\mathrm{poly},i} = \sum_{\mathrm{poly}\ j \neq i} U_{pp}(\mathbf{R}_i, \mathbf{R}_j) + \sum_{\mathrm{cap}\ k} U_{cp}(\mathbf{R}_i, \mathbf{R}_k, \{\mathbf{b}_k\}, \{\mathbf{a}_k\}) \qquad (2)$$

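where $\mathbf{R}$, $\{\mathbf{a}\}$ and $\{\mathbf{b}\}$ are defined as before. The capsomer-capsomer potential $U_{cc}$
is the sum of a repulsive potential between every pair of excluders and an attractive
interaction between complementary attractors:

$$U_{cc}(\mathbf{R}_i, \{\mathbf{a}_i\}, \{\mathbf{b}_i\}, \mathbf{R}_j, \{\mathbf{b}_j\}, \{\mathbf{a}_j\}) = \sum_{k,l}^{N_b} \mathcal{L}_8\left(|\mathbf{R}_i + \mathbf{b}_i^k - \mathbf{R}_j - \mathbf{b}_j^l|,\ 2^{1/4}\sigma_b,\ \sigma_b\right)$$
$$\qquad + \sum_{k,l}^{N_a} \chi_{kl}\,\varepsilon_{cc}\,\mathcal{L}_8\left(|\mathbf{R}_i + \mathbf{a}_i^k - \mathbf{R}_j - \mathbf{a}_j^l| + 2^{1/4}\sigma_a,\ 4\sigma_a,\ \sigma_a\right) \qquad (3)$$

where $\varepsilon_{cc}$ is an adjustable parameter setting the strength of the capsomer-capsomer
attraction at each attractor site, $N_b$ and $N_a$ are the number of excluders and attractors
respectively, $\sigma_b$ and $\sigma_a$ are the diameters of the excluders and attractors, which are set
to 1.0 and 0.20 respectively throughout this work, $\mathbf{b}_i^k$ ($\mathbf{a}_i^k$) is the body-centered location
of the $k$th excluder (attractor) on the $i$th subunit, and $\chi_{kl}$ is 1 if attractors $k$ and $l$ are
overlapping in a completed capsid (Figure 1b) and 0 otherwise. The function $\mathcal{L}_p$ is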
defined as a truncated Lennard-Jones-like potential:

: otherwise 

The capsomer-polymer interaction is defined identically to the capsomer attractor
potential. For capsomer $i$ with position $\mathbf{R}_i$, attractor positions $\{\mathbf{a}_i\}$, excluder positions
$\{\mathbf{b}_i\}$ and polymer subunit $j$ with position $\mathbf{R}_j$, the potential is:

$$U_{cp}(\mathbf{R}_i, \{\mathbf{b}_i\}, \{\mathbf{a}_i\}, \mathbf{R}_j) = \sum_{k} \mathcal{L}_8\left(|\mathbf{R}_i + \mathbf{b}_i^k - \mathbf{R}_j|,\ 2^{1/4}\sigma_{bp},\ \sigma_{bp}\right)$$
$$\qquad + \sum_{k} \chi_k\,\varepsilon_{cp}\,\mathcal{L}_8\left(|\mathbf{R}_i + \mathbf{a}_i^k - \mathbf{R}_j| + 2^{1/4}\sigma_p,\ 4\sigma_p,\ \sigma_p\right) \qquad (5)$$

$$\sigma_{bp} = \tfrac{1}{2}\left(\sigma_b + \sigma_p\right)$$

where $\varepsilon_{cp}$ is an adjustable parameter setting the strength of the capsomer-polymer
attraction at each attractor site, $\sigma_p$ is the diameter of a polymer subunit, which is set
to $0.4\sigma_b$ throughout this work, and $\chi_k$ is 1 if attractor $k$ is one of the three central
polymer attractors on the subunit (see figure 2b), 1/2 if $k$ is one of the three outermost
polymer attractors and 0 otherwise. The factor of 1/2 for the outer polymer attractor
compensates for the fact that in the ground state of the capsid, each such attractor
will overlap with an outer attractor from across the capsomer-capsomer interface.
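Finally, the polymer-polymer subunit interaction is broken into bonded and non-bonded
components, where the bonded interactions are only evaluated for monomers occupying
adjacent positions along the polymer chain:

$$U_{pp}(\mathbf{R}_i, \mathbf{R}_j) = \begin{cases} \mathcal{L}_8\left(R_{ij},\ 2^{1/4}\sigma_p,\ \sigma_p\right) & : R_{ij} < 2^{1/4}\sigma_p \\ \mathcal{L}_8\left(2^{5/4}\sigma_p - R_{ij},\ 2^{1/4}\sigma_p,\ \sigma_p\right) & : R_{ij} > 2^{1/4}\sigma_p\ \&\ \{i,j\}\ \text{bonded} \\ 0 & : R_{ij} > 2^{1/4}\sigma_p\ \&\ \{i,j\}\ \text{nonbonded} \end{cases} \qquad (6)$$

where $R_{ij} \equiv |\mathbf{R}_i - \mathbf{R}_j|$ is the center-to-center distance between the polymer subunits.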
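The explicit expression for the truncated Lennard-Jones-like function is not legible in this copy, so the following is only a sketch under an assumed form: an 8-4 function, 4[(sigma/x)^8 - (sigma/x)^4], set to zero beyond the cutoff. That form is chosen here because its minimum lies at 2^(1/4) sigma, which matches the excluder cutoff quoted above; the prefactor, the lack of an energy shift at the cutoff, the positive sign of the attractor shift, and the function names are all assumptions rather than the authors' definition.

```python
def lj_8_4(x, x_c, sigma):
    """ASSUMED truncated 8-4 Lennard-Jones-like form (stand-in for the lost Eq. 4)."""
    if x <= 0.0:
        raise ValueError("separation must be positive")
    if x >= x_c:
        return 0.0  # truncated beyond the cutoff
    s4 = (sigma / x) ** 4
    return 4.0 * (s4 * s4 - s4)


def attractor_energy(r, eps, sigma):
    """Attractor pair energy with the argument shifted by 2^(1/4) sigma,
    so that the bottom of the well sits at zero separation (assumed sign)."""
    return eps * lj_8_4(r + 2 ** 0.25 * sigma, 4.0 * sigma, sigma)


sigma_a = 0.2                      # attractor diameter in units of sigma_b
x_min = 2 ** 0.25 * sigma_a        # location of the minimum of the assumed form
well_depth = attractor_energy(0.0, 4.0, sigma_a)   # perfectly overlapping attractors
far_apart = attractor_energy(3.0 * sigma_a, 4.0, sigma_a)  # beyond the cutoff
```

Under these assumptions two perfectly overlapping complementary attractors contribute one well depth each, and the interaction vanishes once their separation exceeds roughly 2.8 attractor diameters.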




2.4. Length Scales

Based on the size of a typical T=1 capsid we can assign a value to the simulation unit
of length $\sigma_b$. Choosing satellite tobacco mosaic virus with outer radius 9.1 nm [81]
gives $\sigma_b \approx 2.36$ nm and the edge length of our triangular subunits as $\approx 7$ nm and
$\sigma_a = 0.2\sigma_b \approx 0.5$ nm as the range of the individual capsomer-capsomer attractors.
One polymer segment, with diameter $\sigma_p = 0.4\sigma_b$, could represent about 3 base
pairs of homopolymeric ssRNA and our statistical segment length is 1.5 times that
of ssRNA. Finally, we will present subunit bath concentrations as $c_0$ with units $\sigma_b^{-3}$;
the approximate experimental concentration corresponding to our simulations is thus
$c_\mathrm{exp} \approx 1.25 \times 10^{5}\,c_0$ μM, according to which we sample from concentrations of 80 to 500
μM. It is important to note, however, that results from this highly simplified model
should only be taken to be qualitative and that these length scales, in particular the
mapping to concentration, merely serve to identify orders of magnitude.
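The mapping from reduced to experimental concentration can be checked with a few lines of arithmetic. The sketch below assumes only the length unit of 2.36 nm quoted above and that the logarithm of the concentration is a natural logarithm (an assumption, although it is the reading that gives about 80 μM for a log concentration of -7.38); the function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole


def micromolar_per_reduced_unit(sigma_b_nm):
    """Micromolar concentration equivalent to one particle per sigma_b^3."""
    volume_litres = (sigma_b_nm * 1e-9) ** 3 * 1e3  # m^3 converted to litres
    return 1.0 / (volume_litres * AVOGADRO) * 1e6


def to_micromolar(log_c0, sigma_b_nm=2.36):
    """Convert ln(c0), with c0 in units of sigma_b^-3, to micromolar."""
    return math.exp(log_c0) * micromolar_per_reduced_unit(sigma_b_nm)


factor = micromolar_per_reduced_unit(2.36)  # compare with the quoted 1.25e5
c_low = to_micromolar(-7.38)                # compare with the quoted ~80 micromolar
```

The computed conversion factor agrees with the quoted prefactor to within about one percent, which is well inside the order-of-magnitude accuracy claimed for the mapping.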



Encapsulation of a polymer by an icosahedral virus 






2.5. Dynamics simulations 

We evolve particle positions and orientations from random non-overlapping initial
positions with over-damped Brownian dynamics using a second order predictor-corrector
algorithm [82, 83]. The capsomer subunits have anisotropic translational and rotational
diffusion constants calculated using Hydrosub7.C [84]. To represent an experiment with
excess capsid protein, the system is coupled to a bulk solution with concentration $c_0$ by
performing grand canonical Monte Carlo moves in which subunits more than $10\sigma_b$ from
the polymer are exchanged with a reservoir at fixed chemical potential with a frequency
consistent with the diffusion limited rate. While it is beyond the scope of this
manuscript to consider other protein-polymer stoichiometries, the effect of stoichiometry
on polymer encapsulation is analyzed with an equilibrium theory in Ref. [85], and
the effects of stoichiometry on the equilibrium and kinetics of the encapsulation of
nanoparticles are discussed in Ref. [86]. To mimic a bulk system, periodic boundary
conditions are employed with the box side length $40\sigma_b$.
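As an illustration of what a second order predictor-corrector step looks like for over-damped Brownian dynamics, the sketch below uses a Heun-type scheme: an Euler predictor followed by a corrector that averages the forces at the old and predicted positions, with the same random displacement in both stages. It is a simplified stand-in, not the algorithm of Refs. [82, 83]: it treats translation only, with a single isotropic diffusion constant, whereas the simulations described above use anisotropic translational and rotational diffusion. The function name and arguments are hypothetical.

```python
import math
import random


def bd_step(x, force, dt, diffusion, kT=1.0, rng=None, noise=True):
    """One Heun-type predictor-corrector step of over-damped Brownian dynamics.

    x is a list of coordinates; force(x) returns a list of the same length.
    """
    rng = rng or random.Random()
    mobility = diffusion / kT
    amplitude = math.sqrt(2.0 * diffusion * dt) if noise else 0.0
    # The same random kicks are reused in the predictor and the corrector.
    kicks = [amplitude * rng.gauss(0.0, 1.0) for _ in x]
    f0 = force(x)
    x_pred = [xi + mobility * fi * dt + ki for xi, fi, ki in zip(x, f0, kicks)]
    f1 = force(x_pred)
    return [xi + 0.5 * mobility * (fa + fb) * dt + ki
            for xi, fa, fb, ki in zip(x, f0, f1, kicks)]


# Deterministic check in a harmonic well, F = -k x, with the noise switched off.
k = 2.0
x0 = [1.0, -0.5, 0.25]
x1 = bd_step(x0, lambda x: [-k * xi for xi in x], dt=0.01, diffusion=1.0, noise=False)
```

With the noise off and a = (D/kT) k dt, each coordinate is multiplied by 1 - a + a^2/2, which is the second order expansion of the exact decay factor exp(-a).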

2.6. Equilibrium calculation of the driving force for polymer encapsulation.

To determine the thermodynamic driving force for encapsulation of the polymer in this 
model, we compute the difference in chemical potential between a free polymer and 
a polymer encapsulated in a perfect capsid. By computing this chemical potential 
difference as a function of polymer length, we identify the polymer length that is 
thermodynamically optimal for packaging. Specifically, we implemented an off-lattice 
version of the procedure outlined by Kumar et al. [87] for calculating the residual 
chemical potential $\mu_r$ of a polymeric chain:

$$-\beta\mu_r(N_p) = -\beta\left\{\mu_\mathrm{chain}(N_p + 1) - \mu_\mathrm{chain}(N_p)\right\} = \log\left\langle \exp\left(-\beta U_i(N_p)\right)\right\rangle \qquad (7)$$

where $N_p$ is the number of segments in the chain and $U_i$ is the interaction energy
experienced by a test (ghost) segment added to either end of the chain with a random
position. The angle brackets in equation (7) refer to an equilibrium average over
configurations of the chain with $N_p$ segments and positions of the test segment. Due to
the potential between bonded polymer subunits, equation (6), importance sampling
was required for the average to be computationally feasible. The positions of the
inserted particles were chosen such that the distance from the test particle to its 
bonded partner on the chain is drawn from a normal distribution with mean 
and standard deviation $0.25\sigma_p$, truncated at $0.75\sigma_p$. The effect of the biased insertion
locations was removed a posteriori according to the standard formula for non-Boltzmann
sampling [88, 89].
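The estimator in equation (7) can be sketched as a log-mean-exp over the energies of the test insertions. The code below is a generic illustration of this test-particle average, not the authors' implementation; the function name is hypothetical, and the optional `log_weights` argument is an assumed way of passing importance weights (target density divided by biased sampling density) when the insertion positions are drawn from a biased distribution.

```python
import math


def residual_chemical_potential(test_energies, beta=1.0, log_weights=None):
    """Return beta * mu_r from test-insertion energies.

    log_weights holds log(p_target / p_biased) for each insertion;
    None means the insertions were drawn without bias.
    """
    n = len(test_energies)
    if log_weights is None:
        log_weights = [0.0] * n
    terms = [lw - beta * u for u, lw in zip(test_energies, log_weights)]
    # Log-mean-exp, shifted by the largest term for numerical stability.
    top = max(terms)
    log_mean = top + math.log(sum(math.exp(t - top) for t in terms) / n)
    return -log_mean


uniform = residual_chemical_potential([2.0] * 5)               # every insertion costs 2 kT
half_blocked = residual_chemical_potential([0.0, float("inf")])  # one overlapping insertion
```

Insertions that overlap an existing particle have infinite energy and contribute zero to the average, so blocking half of the trial positions raises the residual chemical potential by log 2 in units of kT.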

Once the calculation of the average test particle energy was completed for a 
particular value of $N_p$, the polymer length was increased by one segment and the
calculation was repeated. At each value of A^p, 10^ test insertions were performed 
interleaved with 10^ dynamics steps for 50 independent trials. Each calculation began at 
$N_p = 1$. To calculate the difference in chemical potentials between free and encapsulated
polymers, the procedure was performed for an isolated polymer as well as polymers 
inside capsids. For the latter calculations, the polymer subunit was started inside a 
well-formed empty capsid. To enhance computational feasibility, the capsid subunit 
positions were not relaxed during the calculation. 



3. Results 



To understand the influence of polymer properties on capsid assembly, we performed
simulations for a range of polymer lengths $N_p$, polymer-subunit interaction energies $\varepsilon_{cp}$,
capsid subunit-subunit binding energies $\varepsilon_{cc}$, and free subunit concentrations $c_0$. The
parameters $\varepsilon_{cc}$ and $\varepsilon_{cp}$ could be experimentally controlled by varying solution pH or
ionic strength [90, 91].



3.1. Kinetic Phase Diagram 



We begin by considering assembly outcomes at the observation time tobs = 2 x lO^to, 
which is long enough that assembly outcomes do not vary significantly with time except
at short polymer lengths, but is not sufficient to equilibrate kinetic traps if there are large
activation barriers. Results are shown for $\log c_0 = -7.38$, which maps to $\approx 80$ μM (see
section 2.4), and $\varepsilon_{cc} = 4.0\,k_BT$. Recalling that $\varepsilon_{cc}$ is the energy per attractor, this value
may seem like a large binding energy, but the short-ranged and stereospecific subunit-
subunit interactions involve a large entropy penalty [65, 92] and dimerization is
unfavorable free energetically, with a dissociation constant $K_d = 1$ mM (see Appendix
B). A rough estimate of the free energy per subunit in a complete capsid for this binding
energy is $\approx -9.2\,k_BT$. Spontaneous assembly of empty capsids at this subunit
concentration requires $\varepsilon_{cc}$ of roughly $5.0\,k_BT$, or a free energy per subunit of roughly
$-14.5\,k_BT$, which is consistent with experimental values at which empty capsids assemble
(e.g. [93, 94, 95]).

Fig. 3a is a 'kinetic phase diagram', showing the dominant assembly outcome as a
function of $N_p$ and $\varepsilon_{cp}$ (figure 3b) at $t_\mathrm{obs}$. There is a single region of polymer lengths and
interaction strengths in which most polymers are completely encapsulated in well-formed
capsids (defined in section 2.1 and figure 1b). For the remainder of this article, we will
refer to complete encapsulation in a well-formed capsid as 'successful' assembly. Within
this region there are optimal polymer lengths and values of $\varepsilon_{cp}$ for which the fraction of
trajectories ending in success is nearly 100% (figure C1, Appendix C). Notably, polymers
that are much larger than the capsid before packaging are successfully encapsulated: the
effective capsid inner radius is $2.33\sigma_b$, while high success fractions are found for $N_p = 230$
with unpackaged radius of gyration $R_g = 5.49\sigma_b$, and the longest successfully packaged
polymer had $N_p = 300$ and $R_g = 6.43\sigma_b$. This result is consistent with the experimental
observation that polystyrene sulfonate molecules with radii of gyration larger than capsid
size were encapsulated in cowpea chlorotic mottle virus capsids [8, 17].
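The radii of gyration quoted above follow from the good-solvent scaling given in section 2.2, read here as R_g = 0.21 N_p^(3/5) in units of the excluder diameter. The short sketch below evaluates that scaling for the two polymer lengths mentioned and compares the result with the effective capsid inner radius of 2.33; the function and constant names are illustrative.

```python
def radius_of_gyration(n_segments, prefactor=0.21, exponent=3.0 / 5.0):
    """Unpackaged radius of gyration in units of sigma_b: R_g = 0.21 * N_p**(3/5)."""
    return prefactor * n_segments ** exponent


CAPSID_INNER_RADIUS = 2.33  # effective inner radius quoted in the text, units of sigma_b

for n_p in (230, 300):
    r_g = radius_of_gyration(n_p)
    print(f"N_p = {n_p}: R_g = {r_g:.2f} sigma_b, "
          f"{r_g / CAPSID_INNER_RADIUS:.1f} times the capsid inner radius")
```

Both polymers have an unpackaged radius of gyration more than twice the capsid inner radius, which is the sense in which they are much larger than the capsid before packaging.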
Figure 3: Kinetic phase diagram showing the dominant assembly product as a function
of A^p and Scp for s^c = 4.0 and logco = -7.38 at observation time tobs = 2 x lO^to- 
The legend on the right shows snapshots from simulations that typify each dominant 
configuration. Data points indicate the majority outcome, except for the 'malformed' 
and 'mixture' points. For malformed points there was a plurality of malformed capsids 
and a majority of malformed plus well-formed capsids. For points labeled 'mixed phase' 
there was no clear plurality. The exact proportions of the outcomes are available in figure
C1 (Appendix C). Data points correspond to 20 independent assembly trajectories.



As the polymer length or $\varepsilon_{cp}$ deviate from their optimal values, successful
encapsulation yields are reduced by several failure modes. Polymers that are short
enough to become completely adsorbed before the capsid finishes assembling tend
to result in incomplete, but well-formed 'on-pathway' capsids for moderate binding
energies $\varepsilon_{cc}$. As discussed below, assembly slows dramatically after the polymer is
completely encapsulated because the polymer plays both thermodynamic and kinetic
roles in enhancing assembly. We note that if the assumption of infinite dilution
of polymers is relaxed, capsids could assemble around multiple short polymers.

As $\varepsilon_{cp}$ or $N_p$ are increased past their optimal values, several forms of thermodynamic
or kinetic traps hinder encapsulation; hence, weaker subunit-polymer interactions
enable packaging of longer polymers. There is a similar nonmonotonic dependence of
encapsulation yields with respect to binding energies $\varepsilon_{cc}$ or the free subunit concentration
(Fig. [ok below). These observations are consistent with the results of Kivenson et al. 
[60], suggesting that the dependence of assembly outcomes on system parameters does
not depend strongly on subunit or capsid geometries. However, the present model 
enables us to examine the morphologies of failure modes as a function of system 
parameter values and in the presence of correlated polymer-subunit motions. The 
off-pathway failure modes can be roughly separated into three categories, illustrated
by representative snapshots in figure 3b: (1) Uncontained, in which the capsid closes
around an incompletely encapsulated polymer. As discussed in Kivenson et al. [60],
uncontained configurations form when the addition of capsomer subunits and eventual
capsid closure are fast compared to polymer incorporation; a large activation barrier
hinders complete encapsulation of such configurations. Beyond a certain polymer length,
uncontained configurations become thermodynamically favorable (see below). If the
polymer is longer still ($N_p > 300$), the uncontained segment acts much like a free
polymer and nucleates the assembly of a second completed capsid, which results in a
'doublet', as shown in figure 3b. For the larger values of $\varepsilon_{cp}$ in the uncontained regime,
both capsids can nucleate and grow simultaneously. Even longer polymer lengths can
lead to multiplets with more than two capsids, similar to structures recently seen in
electron microscopy images of cowpea chlorotic mottle virus (CCMV) proteins assembled
around RNA molecules with lengths that are multiples of the CCMV genome length [96].
(2) Multiple large partial capsids. When multiple capsids nucleate on the same polymer
and grow to significant size (10 or more subunits) without associating, they are rarely
geometrically compatible for fusion. Even though adsorbed oligomers contact each other
frequently due to polymer motions, successful merging from such a configuration is rare
because it requires significant subunit dissociation. (3) Defective but closed capsids,
which we refer to as 'malformed' in this work. For many combinations of large $N_p$
and $\varepsilon_{cp}$ we observe closed shells with hexameric dislocations (figure C3) that resemble
the closed structures found by Nguyen et al. [68] for T=1 capsids, noting that we
only consider trimeric subunits here. We also find structures in which two well-formed
capsids share a single triangular face (see figure 3b), reminiscent of the structure of
many geminiviruses [97].

3.2. Comparison to equilibrium results. 



Since the assembly outcomes in figure 3a are measured at finite observation times,
they identify configurations that are metastable on assembly time scales, and therefore
relevant to in vitro experiments and viral replication in vivo. To fully understand the
relationship between driving forces and assembly yields, it is interesting to compare these
results to equilibrium thermodynamics. We therefore measured the chemical potential
for a polymer encapsulated in a well-formed capsid, $\mu_\mathrm{chain}^\mathrm{cap}$, and for a free polymer,
$\mu_\mathrm{chain}$ (see section 2.6). The difference $\mu_\mathrm{chain}^\mathrm{cap} - \mu_\mathrm{chain}$ measures the equilibrium driving
force to completely enclose the polymer in a well-formed capsid, and is a typical result
of an equilibrium calculation (e.g. HH HSl HH IST]). 

The residual chemical potential difference $\mu_\mathrm{r}^\mathrm{cap} - \mu_\mathrm{r}$, which gives the change in
driving force upon lengthening the polymer by a single segment, is shown for several
values of $\varepsilon_\mathrm{cp}$ in figure 4a. The thermodynamically optimal polymer length for packaging
in a well-formed capsid, $N_\mathrm{p,eq}$, corresponds to the length at which $\mu_\mathrm{r}^\mathrm{cap} - \mu_\mathrm{r} = 0$. In
contrast to the kinetic results described above, we see that $N_\mathrm{p,eq}$ increases monotonically
with $\varepsilon_\mathrm{cp}$: $N_\mathrm{p,eq} \approx 195, 220, 230$ for $\varepsilon_\mathrm{cp} = 3.5, 4.0, 4.5$ respectively. For comparison, the
fraction of successful dynamical assembly trajectories is shown as a function of $N_\mathrm{p}$
in figure 4b, where we see that the highest yields are obtained for the intermediate
$\varepsilon_\mathrm{cp} = 4.0$. All values of $\varepsilon_\mathrm{cp}$ show a sharp decrease in yields of well-formed capsids as the
polymer length approaches $225 \lesssim N_\mathrm{p} \lesssim 250$; the drop-off point is nearly insensitive to
$\varepsilon_\mathrm{cp}$ (although still nonmonotonic). Interestingly, while this polymer length is close to
the thermodynamically optimal polymer lengths, it does not reproduce their dependence
on $\varepsilon_\mathrm{cp}$.
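
As an illustration of how the optimal length can be read off a curve of the residual chemical potential difference, the following minimal Python sketch locates the zero crossing by linear interpolation. It is not the analysis code used for this work. The function name `optimal_length`, the sign convention (negative values mean that adding a segment inside the capsid is favorable) and the tabulated values are assumptions made for the example only.

```python
def optimal_length(lengths, residual_dmu):
    """Return the polymer length at which the residual chemical potential
    difference changes sign, using linear interpolation between samples.

    lengths      -- increasing polymer lengths (number of segments)
    residual_dmu -- residual chemical potential difference at each length (k_B T)
    Returns None if the data do not bracket a sign change.
    """
    samples = list(zip(lengths, residual_dmu))
    for (n0, d0), (n1, d1) in zip(samples, samples[1:]):
        if d0 == 0.0:
            return float(n0)
        if d0 < 0.0 < d1:
            # Linear interpolation between the two bracketing samples.
            return n0 + (n1 - n0) * (-d0) / (d1 - d0)
    if samples and samples[-1][1] == 0.0:
        return float(samples[-1][0])
    return None


# Hypothetical values for illustration only (not simulation data).
lengths = [150, 175, 200, 225, 250]
residual_dmu = [-0.30, -0.15, -0.05, 0.05, 0.20]
n_eq = optimal_length(lengths, residual_dmu)
```

With real data one would apply the same procedure separately to the curve for each capsomer-polymer affinity.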



As shown in Figs. 3a and C1, the uncontained failure mode is largely responsible
for the sharp drop-off in well-formed capsid yields at large polymer lengths. While
uncontainment can occur out of equilibrium if the capsid closes faster than the polymer
is incorporated, it becomes thermodynamically favored over a well-formed capsid above a
particular polymer length. The 'uncontainable length' $N_\mathrm{p}^\mathrm{unc}$ can be estimated as follows.
The residual chemical potential difference $\mu_\mathrm{r}^\mathrm{cap} - \mu_\mathrm{r}$ is roughly zero for the uncontained
portion of the polymer, so the lowest free energy configuration of an uncontained
polymer would have the thermodynamically optimal length contained and the remainder
uncontained. The uncontained configuration becomes thermodynamically favored over
a well-formed capsid when the residual chemical potential difference, integrated over the
segments beyond the optimal length, becomes larger than the capsomer-capsomer strain
free energy in an uncontained configuration.
The strain energy was measured in the simulations to be approximately $10$-$20\,k_\mathrm{B}T$. Neglecting
capsomer entropy differences between well-formed and uncontained configurations,
comparison of this value with figure 4 estimates that uncontained polymers become
thermodynamically favored at $N_\mathrm{p}^\mathrm{unc} \approx 250$ for $\varepsilon_\mathrm{cp} = 3.5$, which is close to the drop-off
length. Above this length, simulation results show predominantly uncontained polymers
(figure 3a).
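
The estimate described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the value quoted in the text: the function name `uncontainable_length`, the use of the trapezoid rule, and all tabulated numbers are hypothetical. The residual chemical potential difference is integrated upward from the optimal length, and the first sampled length at which the accumulated penalty exceeds the strain energy is returned.

```python
def uncontainable_length(lengths, residual_dmu, n_eq, strain_energy):
    """Estimate the length above which an uncontained configuration is favored.

    lengths       -- increasing polymer lengths (number of segments)
    residual_dmu  -- residual chemical potential difference per segment (k_B T)
    n_eq          -- thermodynamically optimal length (difference is zero there)
    strain_energy -- capsomer-capsomer strain free energy (k_B T)
    Returns None if the penalty never exceeds the strain energy.
    """
    # Start the integration at n_eq, where the difference vanishes by definition.
    points = [(n_eq, 0.0)]
    points += [(n, d) for n, d in zip(lengths, residual_dmu) if n > n_eq]
    penalty = 0.0
    for (n0, d0), (n1, d1) in zip(points, points[1:]):
        penalty += 0.5 * (d0 + d1) * (n1 - n0)  # trapezoid rule
        if penalty > strain_energy:
            return n1
    return None


# Hypothetical values for illustration only (not simulation data).
lengths = [200, 220, 240, 260, 280]
residual_dmu = [0.0, 0.2, 0.5, 0.9, 1.4]
n_unc = uncontainable_length(lengths, residual_dmu, n_eq=200, strain_energy=15.0)
```

Because the result is reported on the sampling grid, its resolution is limited by the spacing of the tabulated lengths.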

From figure C1c we can also see that with strong interactions ($\varepsilon_\mathrm{cp} \ge 4.5$), there
is a rise in the production of malformed capsids, which are larger closed structures containing
hexameric dislocations (as in figure C3). For longer polymers, these defective structures
compete thermodynamically with well-formed and/or uncontained configurations since
they permit more capsomer-polymer contacts while incurring about $12$-$17\,k_\mathrm{B}T$ of strain
energy. Their prevalence even at moderate polymer lengths, by contrast, is a kinetic
effect that results from the strong capsomer-polymer interactions preventing the defects
from annealing. As discussed for empty capsid assembly in Refs. [65] [98] [75] [99],
kinetic traps dominate in an assembly reaction when the time to add new subunits is
short compared to the time required for partial capsids to anneal defects or 'locally
equilibrate'. Annealing requires the disruption of favorable but imperfect interactions,
and frequently occurs through the dissociation of improperly bound subunits (as
discussed further in section 3.3). The annealing time therefore increases exponentially
with $\varepsilon_\mathrm{cc}$ and $\varepsilon_\mathrm{cp}$, while the subunit association time decreases with $c_0$ or $c_1$ (section
3.3). Thus results at a finite observation time deviate more strongly from equilibrium
as any of these parameters is increased. A comparison of the kinetic results to an
equilibrium calculation that considers all possible assembly products is desirable but
beyond the scope of this work.

Comparison to experimental lengths. Based on the length scales assigned in
section 2.4, the optimal and maximal polymer lengths correspond to approximately
500-750 nucleotides, which is shorter than the 1000 nucleotide genome length of STMV. The
optimal length could have been adjusted by adding attractor sites; simulation
results suggest that the optimal polymer length is roughly linear in the number of
attractors in the regime that we have considered, although it depends on attractor
spacing and eventually saturates. At this level of simplification there is not an exact
mapping between the number of charges on capsid proteins and the number of attractor
sites, especially considering the complexities associated with changes in the amount of
counterion condensation that occur when charged polymers adsorb onto charged capsid
proteins. However, we did not adjust the number of attractors because the results do
not change qualitatively, and we did not aim for quantitative accuracy from such a
simplified model that does not explicitly calculate electrostatics. Finally, the optimal
length might also change if flexible ARMs [47] and/or representations of base-pairing
that lead to compact structures [85] [60] are considered.




Figure 4: a) Residual chemical potential difference between a polymer grown inside
a well-formed capsid and a free chain, $\mu_\mathrm{chain}^\mathrm{cap} - \mu_\mathrm{chain}$, at indicated capsomer-polymer
affinities $\varepsilon_\mathrm{cp}$. b) The fraction of Brownian dynamics trajectories that end with a polymer
completely encapsulated in a well-formed capsid is shown for the same capsomer-polymer
affinities.

Figure 5: Two mechanisms for assembly around the polymer. (a,b) The number of
capsomer subunits adsorbed onto the polymer (solid) and the size of the largest partial
capsid (dashed) are shown as a function of time for (a) a trajectory with low $O_\mathrm{nani}$
(the sequential assembly mechanism) and (b) a trajectory exhibiting high $O_\mathrm{nani}$ (the en
masse mechanism). Parameters are (a) $N_\mathrm{p} = 200$, $\varepsilon_\mathrm{cp} = 3.0$, $\log c_0 = -6.5$, $\varepsilon_\mathrm{cc} = 4.5$
and (b) $N_\mathrm{p} = 150$, $\varepsilon_\mathrm{cp} = 4.5$, $\log c_0 = -5$, $\varepsilon_\mathrm{cc} = 3.25$. (c) Snapshots from the simulation
trajectory shown in (a) (points marked with arrows). (d) Snapshots corresponding to
points marked with arrows in (b), showing the mass adsorption of subunits onto the
polymer followed by annealing of multiple intermediates and finally completion. Once
the polymer is completely contained within the partial capsid (second to last frame),
addition of the last subunit is relatively slow, as discussed in the text.



3.3. Assembly Mechanisms 

In this section we discuss the mechanisms of polymer encapsulation and how these 
mechanisms depend on the system control parameters. Assembly trajectories can be 
described by two modes, depending on the rate and free energy for subunits to adsorb 
to the polymer. Typical trajectories that illustrate each of these modes are shown
in figure 5. When subunit-polymer association is slow or relatively unfavorable (figure
5a,c), assembly first requires nucleation of a small partial capsid on the polymer, followed
by a growth phase in which one or a few subunits sequentially and reversibly bind to
the partial capsid. Polymer encapsulation proceeds in concert with capsid assembly in
this mode. In the alternative mode, subunits adsorb onto the polymer en masse in a
disordered fashion and then must cooperatively rearrange to form an ordered capsid
(figure 5d). Assembly occurs rapidly as multiple oligomers appear and coagulate to
form an ordered capsid. In the particular trajectory shown, the reordering of subunits
results in the polymer contained within a capsid missing one subunit; the final subunit
binds after a delay (see the discussion of assembly rates below).

Figure 6: Contour plots of (top panels) the yield, or fraction of trajectories that end with
well-formed capsids, and (bottom panels) the assembly mechanism order parameter $O_\mathrm{nani}$
defined in the text. Plots are shown as functions of $\varepsilon_\mathrm{cp}$ and $\log c_0$ for parameter values
{$\varepsilon_\mathrm{cc} = 3.25$, $N_\mathrm{p} = 150$} (left), {$\varepsilon_\mathrm{cc} = 4.0$, $N_\mathrm{p} = 150$} (center), {$\varepsilon_\mathrm{cc} = 3.25$, $N_\mathrm{p} = 200$}
(right).

To classify trajectories according to these modes, we define an order parameter
$O_\mathrm{nani}$, which measures the number of subunits adsorbed onto the polymer that are not
in the largest partial capsid, averaged over all recorded snapshots in which the largest
assembled partial capsid has a size in the range $3 < N_\mathrm{largest} \le 8$. Large values of the
order parameter, $O_\mathrm{nani} \gtrsim 8$, indicate that nearly enough subunits to form a capsid have
adsorbed before significant assembly occurs (corresponding to the en masse mechanism),
while small values, $O_\mathrm{nani} \lesssim 2$, correspond to the sequential assembly mechanism. Values
of $O_\mathrm{nani}$ are presented as functions of the system control parameters in figure 6 (bottom
panels), where we see that the en masse mechanism dominates when subunit adsorption
onto the polymer is free energetically favorable and is fast compared to capsid assembly.
Specifically, the number of adsorbed subunits approaches or exceeds the number of
subunits in a capsid, $c_1 N_\mathrm{p} \gtrsim N_\mathrm{c}$, with $c_1$ the one-dimensional concentration of
adsorbed but unassembled subunits and $N_\mathrm{c}$ the capsid size. In order to reach this limit,
the polymer-capsid affinity and free subunit concentration must be large enough that
the equilibrium number of adsorbed subunits reaches $N_\mathrm{c}$ even at $\varepsilon_\mathrm{cc} = 0$, or $c_1^\mathrm{eq} N_\mathrm{p} \gtrsim N_\mathrm{c}$.
Furthermore, subunit adsorption must approach this equilibrium value faster than the
capsid nucleation time $\tau_\mathrm{nuc}$, so that assembly does not deplete $c_1$. Since nucleation times
decrease with increasing concentration and binding energy as
$\tau_\mathrm{nuc} \sim c_0^{-n_\mathrm{nuc}} \exp(G_{n_\mathrm{nuc}-1}/k_\mathrm{B}T)$ [60]
(see Appendix A), these conditions are only met for relatively low binding energies $\varepsilon_\mathrm{cc}$, as
can be seen by comparing figure 6 with the values of $c_1^\mathrm{eq}$ shown in figure C2. Furthermore,
low binding energies facilitate annealing of imperfect geometries and the desorption of
subunits from partial capsids and/or the polymer, which are essential elements of the
en masse mechanism. As evident in Fig. 5b, it is common for the number of adsorbed
subunits to exceed the number in a complete capsid; the excess subunits must unbind
before the polymer can be completely encapsulated. Similarly, the en masse mechanism
frequently involves the association of large oligomers, which often results in imperfect
binding geometries. Annealing of imperfect geometries can occur via rearrangement,
but typically involves the dissociation of some subunits.
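
The order parameter defined above can be computed from recorded snapshots along the following lines. This is a minimal sketch under stated assumptions rather than the analysis code used here: the names `order_parameter` and `classify` are hypothetical, each snapshot is assumed to be summarized by the pair (number of subunits adsorbed on the polymer, size of the largest partial capsid), and every subunit in the largest partial capsid is assumed to count as adsorbed. The window and thresholds are the approximate values quoted in the text.

```python
from statistics import mean


def order_parameter(snapshots, lo=3, hi=8):
    """Average number of adsorbed subunits that are not in the largest partial
    capsid, over snapshots whose largest partial capsid size n satisfies
    lo < n <= hi. Each snapshot is a pair (n_adsorbed, n_largest).
    Returns None if no snapshot falls inside the window.
    """
    values = [n_ads - n_largest
              for n_ads, n_largest in snapshots
              if lo < n_largest <= hi]
    return mean(values) if values else None


def classify(o, en_masse=8.0, sequential=2.0):
    """Label a trajectory using the approximate thresholds from the text."""
    if o is None:
        return "undetermined"
    if o >= en_masse:
        return "en masse"
    if o <= sequential:
        return "sequential"
    return "intermediate"


# Hypothetical snapshot summaries for illustration only.
fast_adsorption = [(12, 4), (14, 5), (15, 9), (3, 2)]
slow_adsorption = [(5, 4), (7, 6)]
o_fast = order_parameter(fast_adsorption)
o_slow = order_parameter(slow_adsorption)
```

Snapshots outside the size window, such as the last two entries of `fast_adsorption`, do not contribute to the average.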

To learn how assembly mechanisms correlate with polymer encapsulation efficiency,
we also present the fraction of successful assembly trajectories in figure 6 (top panels).
We first consider the relatively short polymer length $N_\mathrm{p} = 150$, for which there are
more interaction sites than polymer segments, and the extremely low binding energy
$\varepsilon_\mathrm{cc} = 3.25$ (we did not observe significant yields of assembled capsids with $\varepsilon_\mathrm{cc} < 3$
for any parameter sets). For these parameters, assembly yields increase with $c_1$ until
high values of $c_0$ and $\varepsilon_\mathrm{cp}$, and significant yields occur only for parameters in which
the en masse mechanism dominates. The latter result can be understood by noting
that $\varepsilon_\mathrm{cc} = 3.25$ corresponds to a large critical nucleus and a large critical subunit
concentration, and thus no assembly occurs without a high value of $c_1$. In contrast, for
$\varepsilon_\mathrm{cc} = 4$ significant packaging efficiencies are found only when the sequential mechanism
dominates. As noted in the previous paragraph, extremely high $c_0$ is required to achieve
subunit adsorption rates that are fast compared to assembly time scales at this binding
energy. Assembly is not efficient at those concentrations because of kinetic traps.

A similar dependence of packaging efficiencies on $\varepsilon_\mathrm{cp}$ and $c_0$ is found for longer
polymer lengths (e.g. $N_\mathrm{p} = 200$ in the right panel of figure 6), except that packaging
becomes less successful with increasing $c_1$ in the en masse region even at low $\varepsilon_\mathrm{cc}$. This
trend occurs because mass adsorption onto the longer polymer frequently results in
multiple nuclei that are unable to simultaneously anneal and encapsulate the polymer
and instead yield disordered aggregates, as shown in figure C4.

Figure 7: a) The median growth times (time between nucleation and completion) as
a function of $N_\mathrm{p}$ for indicated values of $\varepsilon_\mathrm{cp}$, with $\varepsilon_\mathrm{cc} = 4.0$ and $\log c_0 = -7.38$. b)
Snapshots from an assembly trajectory demonstrating both sliding, or one-dimensional
diffusion of subunits along the polymer, and the 'fly-casting' mechanism described in
the text. A free subunit binds the polymer (first frame) and slides towards the growing
edge (second and third frame). It then binds to the growing edge of the capsid (fourth
frame) while still attached to the polymer, forming a small loop. Note that fly-casting
is not limited to such short loops.



3.4. The polymer enhances assembly rates.

In addition to affecting assembly outcomes, properties of the polymer have a dramatic
effect on assembly timescales. The polymer significantly lowers the free energy barrier
for nucleation by stabilizing pre-nucleated partial capsid intermediates and, as discussed
next, can increase subunit association rates before and after nucleation. The effect of
the polymer on nucleation rates is described in Appendix A and in Ref.

To quantify the effect of the polymer on rates of growth after nucleation, we
measured growth times, or the times between nucleation and completion, for individual
capsids. As shown in Fig. 7a, the median growth time decreases with polymer length
for all interaction parameters until reaching a parameter-independent limiting value at
approximately $N_\mathrm{p} = 200$. This trend reflects several mechanisms by which the polymer
can influence capsid growth. First, as noted in [60], binding to the polymer stabilizes
partial-capsid intermediates; this is a thermodynamic effect that increases the net rate
of assembly by decreasing the rate of subunit desorption from adsorbed intermediates.
This effect is particularly important for the conditions we study, where empty capsids do
not form spontaneously in the absence of a polymer. Under these conditions assembly
slows significantly once the polymer is completely adsorbed in a partial capsid, resulting
in the on-pathway incomplete capsids discussed in section 3.1 for short polymers. The
effect of increasing $\varepsilon_\mathrm{cp}$ on growth rates saturates when the unbinding rates of
polymer-stabilized subunits become small compared to association rates.

The polymer also enhances growth rates by increasing the flux of subunits to
and from the assembling partial capsid. Subunit flux is enhanced by (at least) two
mechanisms: (1) correlated polymer-subunit motions drag adsorbed subunits to/from
binding sites (i.e. the polymer acts like a fly-caster or the Cookie Monster), and
(2) adsorbed subunits undergo effectively one-dimensional diffusion (sliding) along the
polymer [59] [60]. While sliding was examined in [60], correlated polymer-subunit
motions were not well represented by the single particle Monte Carlo moves used in
that work. We find that both mechanisms occur in the simulations discussed here;
examples can be seen in figure 7b. For the parameters and model geometries that we
use here, correlated polymer-subunit motions are more productive than sliding, and
become more important as $c_1$ increases; the en masse assembly mechanism described
above is essentially the extreme limit of correlated polymer-subunit motions at high $c_1$.
The flux enhancement increases with polymer length and $\varepsilon_\mathrm{cp}$ until the rate of transfer
of subunits from the polymer to capsid binding sites becomes rate-limiting.

Completion phase. The effect of the polymer on subunit association rates leads
to a complicated dependence of growth rates on the partial capsid size and system
parameters, as illustrated by the two trajectories shown in figure 5. In general, net
growth rates slow as the partial capsid nears completion because fewer potential binding
sites remain available and because the rate at which the polymer captures free subunits
diminishes as it is progressively contained. This trend can be seen in the sequential
assembly trajectory shown in figures 5a and 5c. However, because the polymer is
relatively long ($N_\mathrm{p} = 200$) and the capsomer-polymer affinity is relatively weak ($\varepsilon_\mathrm{cp} < 3.5$),
the polymer makes frequent excursions outside of the partial capsid at all sizes and
continues to enhance the subunit flux until the final subunit is in place. In contrast,
the trajectory with high $O_\mathrm{nani}$ (figures 5b and 5d) exhibits rapid growth during the
rearrangement of adsorbed subunits, but stalls when the polymer becomes completely
encapsulated within the capsid missing a single subunit (fourth frame). In this case,
with a shorter polymer ($N_\mathrm{p} = 150$) and stronger capsomer-polymer affinity ($\varepsilon_\mathrm{cp} = 4.5$),
the polymer remains completely incorporated and plays no role in attracting the final
subunit. As a result, insertion of the final subunit is slow compared to the rest of the
assembly process.

We note that the effect of polymer incorporation on the rate of insertion of the
last subunit can be significant, since for empty capsid assembly the subunit addition
rate decreases somewhat as the capsid nears completion. In our model the last subunit
associates on average approximately 4 times more slowly than those added when the partial capsid is
half complete. Unlike the model studied in Nguyen et al. [23], however, insertion of the
final subunit is free energetically favorable, and is not rate limiting under reasonable
conditions.



3.5. Polymer order 

Consistent with experiments (e.g. [27] [30] [37]) and the equilibrium calculation of Forrey
et al. [43], the polymer adopts the symmetry of its capsid, as shown in figure C5. The
polymer order arises as a simple consequence of the symmetric arrangement of low free
energy sites on the interior capsid surface. To obtain the images in figure C5, we
discretized space, and colored each bin with an intensity proportional to the log of the
local polymer density $\rho$. In order for the high-density regions to be visible, bins with
$\log\rho / \log\rho_\mathrm{max} < 0.25$, with $\rho_\mathrm{max}$ the maximum density, were rendered invisible.
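
The thresholding step described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the visualization code used for figure C5: the function name `visible_bins` is hypothetical, cubic bins of a single width are assumed, and the density in each bin is taken to be the raw number of polymer segment positions it contains (so a bin holding one sample has log density zero). The ratio of logarithms depends on the units chosen for the density, which is why this choice is flagged as an assumption.

```python
import math
from collections import Counter


def visible_bins(points, bin_width, cutoff=0.25):
    """Bin 3D points on a cubic grid and return {bin_index: intensity} for the
    bins that remain visible, with intensity = log(count) / log(max count).
    Bins whose intensity is below `cutoff` are dropped.
    """
    counts = Counter(
        tuple(math.floor(x / bin_width) for x in p) for p in points
    )
    if not counts:
        return {}
    log_max = math.log(max(counts.values()))
    if log_max == 0.0:
        # Every occupied bin holds a single sample; show them all equally.
        return {b: 1.0 for b in counts}
    intensity = {b: math.log(c) / log_max for b, c in counts.items()}
    return {b: v for b, v in intensity.items() if v >= cutoff}


# Hypothetical segment positions for illustration only.
points = ([(0.1, 0.1, 0.1)] * 16
          + [(1.5, 0.2, 0.2)] * 4
          + [(2.5, 0.1, 0.1)])
shown = visible_bins(points, bin_width=1.0)
```

In this example the singly occupied bin falls below the cutoff and is hidden, while the two denser bins remain.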



4. Conclusions 



In summary, the calculations in this work show that subunits equipped with interactions 
driving the formation of an icosahedral shell can assemble into a rich array of structures 
around a polymer. The nature of the assembly products can be tuned by changing 
experimentally controllable parameters, such as polymer length, solution conditions, and 
protein concentrations. Furthermore, the mechanism by which assembly takes place can 
be systematically varied from a sequential process resembling empty capsid assembly to 
an en masse process in which subunits rapidly adsorb and then collectively rearrange 
into an ordered capsid. 

The simulations indicate that the en masse mechanism occurs only when the
subunit-subunit binding energy is much weaker than that required for empty capsid
assembly and there is a strong driving force for subunit adsorption onto the polymer.
These criteria are met by many single-stranded RNA viruses at physiological conditions,
for which protein-protein interactions are too weak to drive empty capsid assembly
[90] and there are strong electrostatic interactions between the nucleic acid and capsid
subunits. In particular, Brome mosaic virions have been described as 'loose assemblies'
which cannot maintain structural integrity without protein-nucleic acid and
protein-divalent cation interactions [31] [32] [100] [101].

Given these observations, it might be surprising that the simulations predict that
assembly via the en masse mechanism is less robust than the sequential assembly
mechanism, in the sense that high yields of polymers completely encapsulated in
well-formed capsids are found over smaller ranges of parameter values (e.g. compare
the panels of figure 6). However, the simulations model assembly around a linear polymer,
while secondary and tertiary interactions in RNA molecules lead to compact branched
structures [77]. We speculate that polymer compactification due to base pairing could
increase the robustness of the en masse mechanism, since it brings the problem closer
to the limit of assembly around a rigid core [62].

Appendix A. The effect of the polymer on nucleation times

To understand the effect of the polymer on nucleation times, we build upon what
is known about nucleation times for empty capsid assembly. Several references
have analyzed capsid nucleation through simplified rate equations and/or classical
nucleation theory [72] [102] [99] and find that nucleation times can be expressed as
$(\tau_\mathrm{nuc}^\mathrm{empty})^{-1} \propto f c_0^{n} \exp(-G_{n-1}/k_\mathrm{B}T)$ with $n$ the critical nucleus size, $G_{n-1}$ the interaction
free energy of the largest unstable partial capsid (the amount by which subunit-subunit
interactions decrease the nucleation barrier), and $f$ a rate constant. Roughly speaking,
the concentration of intermediates just below the nucleus size is $c_0^{n-1} \exp(-G_{n-1}/k_\mathrm{B}T)$
and the rate at which a subunit associates to a pre-nucleus is $f c_0$ (a different attempt
rate is derived under the continuum approximation of Ref. [102]). Note that
these theories make an important simplifying assumption, namely that
the identity of a critical nucleus can be defined by the number of subunits alone; i.e.,
the intermediate size is a sufficient reaction coordinate. This assumption was mildly
violated in the simulations of Ref. [60]. Furthermore, for icosahedral capsids it is likely
that critical nuclei correspond to particular small polygons (e.g. [103] [104] [54]), and
different assembly pathways for a given virus could proceed through critical nuclei with
different numbers of subunits [54].

The empty capsid nucleation picture can be extended to include a polymer by
noting that adsorption of subunits onto the polymer affects both the free energy
barrier and the attempt rate. The concentration of pre-nuclei on the polymer can be
expressed as $c_{n-1} \sim N_\mathrm{p} c_0^{n-1} \exp[-(G_{n-1} + \alpha(n_\mathrm{nuc}-1) g_\mathrm{cp})/k_\mathrm{B}T]$ with $g_\mathrm{cp}$ the
polymer-subunit interaction free energy (see Appendix B), $\alpha$ the fraction of potential
polymer-subunit contacts in a typical nucleus, and $G_{n-1}$ the total partial capsid subunit-subunit
interaction free energy. The factor $N_\mathrm{p}$ accounts for the fact that the number of sites
at which a nucleus can form is linear in polymer length. The attempt rate depends
on the rate at which adsorbed and/or free subunits associate with a polymer-bound
partial capsid intermediate, which depends in part on the rates of correlated
polymer-subunit motions and subunit diffusion along the polymer. If nucleation is dominated
by association of subunits that are already adsorbed onto the polymer, the rate can be
expressed as $\tau_\mathrm{nuc}^{-1} \sim f' c_1 c_{n-1}$ with $f'$ a rate constant for polymer-adsorbed subunits. This
scaling was found to be consistent with the simulation data in Ref. [60]. In performing
this analysis, it is important to note that the critical nucleus size $n$ can depend on
interaction free energies and subunit concentrations (see Refs. [102] [99]). We have
not performed a statistical analysis using committor probabilities [60] [105], but for the
conditions studied in this manuscript, critical nucleus sizes for assembly on the polymer
appear to fall in the range $3 < n_\mathrm{nuc} \lesssim 5$ (see Appendix B).
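
To make the scaling arguments of this appendix concrete, the following minimal Python sketch evaluates the two rate expressions up to their unknown prefactors. It is an illustration only: the function names, the choice to set both rate constants to one, and all numerical inputs are hypothetical, a single nucleus size `n` is used for both the concentration exponent and the polymer contact term, and energies are expressed in units of the thermal energy. Only ratios of the returned values are meaningful.

```python
import math


def empty_nucleation_rate(c0, n, g_prev, f=1.0):
    """Inverse nucleation time for empty capsids, up to a prefactor:
    f * c0**n * exp(-G_{n-1}), with G_{n-1} in units of k_B T."""
    return f * c0 ** n * math.exp(-g_prev)


def polymer_nucleation_rate(c0, c1, n, n_p, g_prev, g_cp, alpha, f_ads=1.0):
    """Inverse nucleation time on the polymer, up to a prefactor, assuming
    nucleation proceeds by association of already adsorbed subunits."""
    # Concentration of pre-nuclei on a polymer of n_p segments.
    c_pre = n_p * c0 ** (n - 1) * math.exp(-(g_prev + alpha * (n - 1) * g_cp))
    return f_ads * c1 * c_pre


# Hypothetical inputs for illustration only.
params = dict(c0=6.25e-4, c1=0.05, n=4, g_prev=-10.0, g_cp=-2.0, alpha=0.5)
rate_150 = polymer_nucleation_rate(n_p=150, **params)
rate_300 = polymer_nucleation_rate(n_p=300, **params)
conc_ratio = (empty_nucleation_rate(2 * 6.25e-4, 4, -10.0)
              / empty_nucleation_rate(6.25e-4, 4, -10.0))
```

In this sketch, doubling the polymer length doubles the polymer nucleation rate, and doubling the free subunit concentration multiplies the empty capsid rate by two raised to the nucleus size, which is the content of the two scaling relations.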



Appendix B. Estimates of Binding Free Energies 



Capsid subunit-subunit binding free energies. In order to estimate the free
energy of subunit-subunit binding, we performed simulations of subunits with attractors
on only one of the three edges, so that only dimerization was possible. We measured
the dimer-monomer dissociation constant $K_\mathrm{d}$ in the absence of polymer for a range
of subunit concentrations and binding energies $\varepsilon_\mathrm{cc}$. The free energy for binding along
a single interface (which involves up to six attractors on each subunit) is then given
by $g_\mathrm{cc} = -k_\mathrm{B}T \ln(c_\mathrm{ss}/K_\mathrm{d})$, where $c_\mathrm{ss} = 8\sigma_\mathrm{b}^{-3}$ is the standard state concentration that
maps to the conventional choice of 1 M (see section 2.4). The resulting free energy
can be expressed as $g_\mathrm{cc} \approx -3.5\varepsilon_\mathrm{cc} - T s_\mathrm{cc}$, with the binding entropy $s_\mathrm{cc} = -12.4\,k_\mathrm{B}$. The
binding entropy arises from rotational entropy loss and the fact that the subunit-subunit
attraction range is smaller than the standard state length scale $\sigma_\mathrm{b}/2$. We calculated
the binding entropy analytically for a similar model in Ref. [65]; for further discussion
also see Refs. [92] [93].

We can obtain an upper bound on the free energy of larger capsid structures by
noting that the binding entropy for dimerization $s_\mathrm{cc}$ is a lower bound for the entropy
lost by a subunit with multiple bonds. Furthermore, the majority of the entropy is lost
upon making the first bond, because the contacts are so stereospecific. A rough estimate
for the free energy of a well-formed model capsid with $N_\mathrm{c} = 20$ subunits
is therefore $G_\mathrm{capsid} \approx -\frac{3}{2} N_\mathrm{c} (3.5\varepsilon_\mathrm{cc}) - (N_\mathrm{c} - 1) T s_\mathrm{cc}$, which gives a free energy per subunit of
$g_\mathrm{capsid} = -9.2\,k_\mathrm{B}T$ at $\varepsilon_\mathrm{cc} = 4.0\,k_\mathrm{B}T$. We note that, despite the fact that forming a capsid
is thermodynamically favorable at these parameters, capsids do not spontaneously
assemble in our simulations until $\varepsilon_\mathrm{cc} = 5.0\,k_\mathrm{B}T$ because of a large nucleation barrier; the
smallest (weakly) favorable structure is a pentamer. For $\varepsilon_\mathrm{cc} = 5.0\,k_\mathrm{B}T$ our estimate gives
$g_\mathrm{capsid} = -14.5\,k_\mathrm{B}T$, which is consistent with experimental values for the free energy per
subunit at which capsids spontaneously assemble [90] [94] [95].
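
The arithmetic behind these estimates can be checked with a short Python sketch. The function names are hypothetical, and energies are expressed in units of the thermal energy so that the entropy term enters as the dimensionless binding entropy (here -12.4). The factor of 3/2 per subunit reflects that each triangular subunit has three edges and each interface is shared by two subunits, giving 30 interfaces for 20 subunits.

```python
def dimer_binding_free_energy(eps_cc, s_cc=-12.4):
    """Free energy of one subunit-subunit interface in k_B T:
    g_cc = -3.5 * eps_cc - T * s_cc, with s_cc given in units of k_B."""
    return -3.5 * eps_cc - s_cc


def capsid_free_energy_per_subunit(eps_cc, n_c=20, s_cc=-12.4):
    """Rough capsid free energy per subunit in k_B T, charging the dimerization
    entropy once for each of the (n_c - 1) subunits added to the first."""
    n_interfaces = 1.5 * n_c  # 3 edges per subunit, each shared by 2 subunits
    total = -n_interfaces * 3.5 * eps_cc - (n_c - 1) * s_cc
    return total / n_c


g_dimer = dimer_binding_free_energy(4.0)        # about -1.6 k_B T
g_weak = capsid_free_energy_per_subunit(4.0)    # about -9.2 k_B T
g_strong = capsid_free_energy_per_subunit(5.0)  # about -14.5 k_B T
```

The two capsid values reproduce the per-subunit free energies quoted in the text.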

Capsid-polymer binding free energy. We estimate the polymer-capsomer
binding free energy by performing simulations in which the capsomer-capsomer binding
energy is set to zero, $\varepsilon_\mathrm{cc} = 0.0$ (figure C2). We can then extract the binding free energy
from the formulation given by McGhee and von Hippel for the binding of a ligand to a
uniform polymer when each ligand occupies more than one binding site [106]. We find
that the free energy of binding for intermediate binding energies $3.0\,k_\mathrm{B}T \le \varepsilon_\mathrm{cp} \le 5.0\,k_\mathrm{B}T$
is given by $g_\mathrm{cp} \approx -1.9\varepsilon_\mathrm{cp} - T s_\mathrm{cp}$ with $s_\mathrm{cp} = -7.4\,k_\mathrm{B}$. Again note that binding of a single
subunit to the polymer, with = A mM, is unfavorable at the default concentration 
we consider, $c_0 = 6.25 \times 10^{-4}\,\sigma_\mathrm{b}^{-3}$ or 80 $\mu$M.
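
The correspondence between reduced and molar concentrations can be verified with a few lines of Python. This is a minimal check based on the standard state mapping given earlier in this appendix (8 subunits per cubic length unit corresponding to 1 M); the names `C_SS` and `reduced_to_molar` are hypothetical.

```python
import math

C_SS = 8.0  # standard state concentration in reduced units, mapped to 1 M


def reduced_to_molar(c_reduced):
    """Convert a concentration in reduced units (per cubic length unit) to molar."""
    return c_reduced / C_SS


c0 = 6.25e-4                                  # default concentration, reduced units
c0_micromolar = reduced_to_molar(c0) * 1e6    # about 78, i.e. roughly 80 micromolar
log_c0 = math.log(c0)                         # natural logarithm, about -7.38
```

The natural logarithm of the default concentration agrees with the value of $\log c_0 = -7.38$ quoted in the caption of figure 7, which suggests that $\log$ denotes the natural logarithm of the concentration in reduced units.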



Appendix C. Further information

Figure C1: The fraction of trajectories that end in each outcome is shown in cumulative
plots as a function of $N_\mathrm{p}$ for $\varepsilon_\mathrm{cp} = \{3.5, 4.0, 4.5, 5.0\}$, for (a)-(d) respectively. The height
of each color corresponds to the fraction of trajectories resulting in that outcome,
color-coded according to the legend in figure 3b. The spike at $N_\mathrm{p} \approx 300$ in (c) corresponds
to a large yield of size 30 defective capsids, examples of which are pictured in the bottom
row of figure C3.

Figure C2: The driving force for subunits to adsorb onto the polymer is revealed by
$c_1^\mathrm{eq}$, the equilibrium one-dimensional concentration of subunits on the polymer in the
absence of capsomer-capsomer attractions ($\varepsilon_\mathrm{cc} = 0$). $c_1^\mathrm{eq}$ is measured as the average
number of adsorbed subunits divided by the polymer length, and shown as a function of
$\varepsilon_\mathrm{cp}$ and $\log c_0$.




Figure C3: Examples of common malformed but closed capsids. The top row shows the
single dominant morphology for sizes 22, 24 and 26. For sizes 24 and 26, the dislocations
(2 in the former case, 3 in the latter) relieve strain by arranging themselves at opposite
poles of the 2- and 3-fold symmetry axes, respectively. In the bottom row are the 3 most
prevalent morphologies for malformed capsids of size 30, for which more strain-relieving
arrangements of hexamers are possible.

Figure C5: Visualization of the polymer density. The polymer density is averaged over
a large number of successful assembly trajectories after completion, for a polymer with
length $N_\mathrm{p} = 150$. Densities are averaged over the threefold symmetry of the capsomer,
but not over the 20-fold symmetry group of the completed capsid.



